Clustering and dendrogram

Sex appears to be the determining factor in the hierarchical clustering. Neither disease status nor ethnicity clusters in any meaningful way. One sample also appears to have a mismatched ‘sex’ label.
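One way to make the mismatched-label observation checkable is to compare each sample's label with the majority label of its cluster. The sketch below is illustrative only: it assumes the dendrogram has already been cut into clusters, and the function name `flag_label_mismatches` and its inputs are hypothetical, not part of the original analysis.

```python
from collections import Counter, defaultdict

def flag_label_mismatches(cluster_ids, labels):
    """Return indices of samples whose label differs from the majority label of their cluster.

    cluster_ids: cluster assignment per sample (assumed to come from cutting the dendrogram).
    labels: recorded metadata label per sample (e.g. sex).
    """
    # Collect the labels observed in each cluster.
    by_cluster = defaultdict(list)
    for cluster, label in zip(cluster_ids, labels):
        by_cluster[cluster].append(label)

    # Majority label per cluster.
    majority = {
        cluster: Counter(cluster_labels).most_common(1)[0][0]
        for cluster, cluster_labels in by_cluster.items()
    }

    # Samples that disagree with their cluster's majority label are candidates for review.
    return [
        i
        for i, (cluster, label) in enumerate(zip(cluster_ids, labels))
        if label != majority[cluster]
    ]
```

A flagged sample is only a candidate for a label error; it would still need to be confirmed against the original records.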

Heatmap

Heatmap using the same clustering as the dendrogram. The most distant groups of rows are separated by sex.

PCA

Screeplot

It would take 119 principal components to capture 90% of the variance in the data.
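A count like this can be read off the cumulative explained variance. The following is a minimal sketch, assuming a samples-by-features matrix `X`; the function name `n_components_for_variance` is hypothetical, and the original analysis may have used a different tool or scaling.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.90):
    """Smallest number of PCs whose cumulative explained variance reaches `threshold`.

    X is assumed to have samples in rows and features in columns.
    """
    # Center each feature; PCA is defined on mean-centered data.
    Xc = X - X.mean(axis=0)

    # Squared singular values are proportional to the variance explained by each PC.
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()

    # First index where the cumulative fraction reaches the threshold, as a 1-based count.
    # The min() guards against floating-point rounding when threshold is close to 1.
    idx = int(np.searchsorted(cumulative, threshold))
    return min(idx + 1, len(cumulative))
```

Needing this many components for 90% suggests the variance is spread over many directions rather than concentrated in the first few PCs.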

Pair plot

Color indicates sex; shape indicates disease status. The sample with the mismatched ‘sex’ label is visible here too.

PC Heatmap

Heatmap of the first 7 PCs.

PC relation with metadata

Sex appears to be the only relevant characteristic separated by the first 5 PCs.
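The visual impression can be backed by a simple number: the correlation between each PC's scores and a numerically coded covariate. This is a rough sketch under stated assumptions, not the method used for the plots. It assumes samples in rows of `X` and a covariate coded as numbers (e.g. 0/1 for sex), and the name `pc_metadata_correlation` is hypothetical.

```python
import numpy as np

def pc_metadata_correlation(X, covariate, n_pcs=5):
    """Pearson correlation between each of the first `n_pcs` PC scores and a numeric covariate.

    X is assumed to have samples in rows; `covariate` has one numeric value per sample.
    """
    # Center features, then obtain PC scores from the SVD (scores = U * S).
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_pcs] * s[:n_pcs]

    cov = np.asarray(covariate, dtype=float)

    # One correlation coefficient per PC; the sign is arbitrary because PC direction is arbitrary.
    return np.array([np.corrcoef(scores[:, k], cov)[0, 1] for k in range(n_pcs)])
```

A covariate that a PC separates well shows a correlation close to 1 in absolute value. Categorical covariates with more than two levels, such as ethnicity, would need a different measure, for example one based on an ANOVA.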

Sex

Disease Status

Age

The lighter the point, the higher the age.

Ethnicity